How to find decision makers in neural circuits? 



Alexei A. Koulakov 1 , Dmitry Rinberg 2 , and Dmitry N. Tsigankov 1 

1 Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA 
2 Monell Chemical Senses Center, Philadelphia, PA, USA 

Neural circuits often face the problem of classifying stimuli into discrete groups and making 
decisions based on such classifications. Neurons of these circuits can be distinguished according 
to their correlations with different features of stimulus or response, which allows defining sensory 
or motor neuronal types. In this study we define a third class of neurons, which is responsible 
for making decisions. We suggest two descriptions of the contribution of units to decision making: 
first, as a spatial derivative of correlations between neural activity and the decision; second, as 
the impact of variability in a given neuron on the response. These two definitions are shown 
to be equivalent where they can be compared. We also suggest an experimental strategy for 
determining contributions to decision making, which uses electric stimulation with time-varying 
random current. 

I. INTRODUCTION 

The nervous system is continuously confronted with 
megabytes of information representing light, sound, 
smell, etc. This information is compiled by the brain 
into a set of decisions, representing the behaviors of living 
organisms. The mechanisms involved in this reduction 
have been under investigation for many years (Glimcher, 
2003; Romo and Salinas, 2003). In this study we address 
a question complementary to the issue of decision making 
(DM) mechanisms. We define the neuronal units involved in 
making perceptual decisions. For this purpose we determine 
DM activity in surrogate networks, defined mathematically, 
in which we have complete control over 
stimuli, mechanisms, and responses. Such decision making 
analysis (DMA) has practical significance: once the 
units involved in making a particular decision are located, 
further efforts can be concentrated on uncovering the 
underlying mechanisms. 

In this study the DM task is defined as the evaluation of a 
function in the multidimensional stimulus space (Figure 
1A). This function has a discrete set of values, representing 
the repertoire of responses available to the organism. 
The decisions may, of course, be stochastic, to reflect 
the uncertainty inherent in behavior. This definition 
is suitable for experiments in which subjects perform 
poly-alternative forced-choice tasks, such as a saccadic response 
to the direction of stimulus motion (Shadlen and Newsome, 
2001). 

Let us consider the motion-discrimination task in more detail. 
Figure 1B lists some visual areas that are involved 
in this task. The areas are arranged along a rough 
sensory-motor axis, so that the areas on the left are more 
"sensory", while those on the right are more "motor". 
This implies that the responses in these areas are more 
correlated with the stimulus or the response, respectively. Where 
on this sensory-motor axis should one position the DM 
elements? One could argue that the elements most correlated 
with the decision itself are the decision makers, 
following the analogy with the definition of sensory and 
motor elements. It is, however, difficult, if not impossible, 
to distinguish such a definition from the definition of 
purely motor units (Shadlen and Newsome, 2001). The 
latter relay the results of the decision making process without 
being involved in the formation of the decision. An 
alternative approach is therefore needed to define the DM 
units. 

FIG. 1 A, Definition of the decision making task. The nervous system 
evaluates a function, whose values represent discrete decisions, 
in the many-dimensional sensory space. B, Some of 
the visual areas involved in the motion-discrimination task. The 
areas on the left are more sensory (response is correlated with 
the sensory input), while those on the right are more motor 
(correlated with the response). 

The DM components may be surmised to be located 
at the interface between sensory and motor areas. More 
precisely, the first element in the sensory-motor chain 
that carries significant correlation with the response 
may be identified as the decision maker. In this study we 
develop this idea into a rigorous mathematical formulation 
and find a special correlation function, which determines 
the contributions of units to DM. This formalism allows us 
to answer two questions pertaining to the identities of 
DM units. First, we consider the case when not one but 
several elements are involved in the same decision simultaneously. 
Our approach allows us to evaluate the relative 
importance of various units in such distributed DM. 
Second, we consider systems with loops in connectivity. 
For such systems the concept of 'the first element' 
becomes more arbitrary and one has to proceed more 
carefully in defining contributions to DM. We succeed 
in doing so for our surrogate networks and define DM 
units for recurrent networks in a way that is consistent 
with the linear sensory-motor chains, thus satisfying the 
requirement of the correspondence principle. 

This paper is organized as follows. We first analyze 
simple linear chain models and networks, such as trees, 
that have similar properties. We then use this analysis 
to define decision makers in networks of arbitrary 
connectivity. Finally, we extend our study to cases 
in which electric stimulation can be applied to units, and 
show that DM components can be identified in a way 
consistent with our preceding analyses. 

II. LINEAR CHAINS AND THEIR DERIVATIVES 

FIG. 2 A, a simple 'nematode' network consists of a linear 
chain of units. All units in the chain, but the last, are linear. 
The last unit, shown by the square, is non-linear and returns 
zero or one depending on the sign of the response of the pre- 
ceding unit. B, The average input-output relationship for the 
'nematode' is given by the sigmoid function (error function). 
The spread of the sigmoid is determined by the net noise in 
the chain. 



The goal of this section is to formulate quantitative 
principles by which DM network elements can be identified. 
We approach this task by analyzing simple cases, 
which can be solved exactly without the use of a computer, 
and in which the identities of the DM elements are clear. 
These cases allow us to emphasize the properties of the DM 
task that we are attempting to describe. We therefore proceed 
to the analysis of the simplest network capable of making 
decisions. 



A. The 'nematode' network 

In this subsection we consider the network, which we 
call 'nematode', because of its resemblance to simple bi- 
ological organisms, both in the layout and in the funda- 
mental significance. We first define the model; then show 
that it can make simple decisions; and, finally, define the 
positions of decision makers in the network. 

Consider a linear chain of units, whose responses are characterized 
by a set of real numbers $x_i$, where $i = 1 \ldots N$ is 
the position of the element in the chain (Figure 2). The 
response of each element does not depend on time. This 
model is therefore static. This assumption is introduced 
here to simplify the analysis and can be relaxed as described 
below (Section III.C). Each unit performs a simple 
linear transformation between its input and 
its output. Thus, for element number $i$ 

$$x_i = x_{i-1} + \eta_i \tag{1}$$

Here $\eta_i$ is the noise associated with the element. In this work 
we assume that noise has zero mean, is individual to each 
unit, and, therefore, is uncorrelated between units, i.e. 

$$\overline{\eta_i} = 0, \qquad \overline{\eta_i \eta_j} = \overline{\eta_i^2}\,\delta_{ij} \tag{2}$$

We further assume that noise has a Gaussian distribution. 
The chain of linear elements is thus completely 
specified by a set of noise variances $\overline{\eta_i^2}$. The model 
described by (1) and (2) yields the following solution for 
the response of the last element in the chain 

$$x_N = x_0 + \eta_1 + \eta_2 + \ldots + \eta_{N-1} + \eta_N \tag{3}$$

Thus, the response of the last element is just the sum of 
the input into the network $x_0$ and the noise contributions from 
all units, independently of the order of the units in the chain. 

The last element in the chain has non-linear response 
properties. Its response is defined by 

$$d = H(x_N), \tag{4}$$

where $H(x)$ is the Heaviside step function, which is equal 
to one or zero if the argument is positive or negative, respectively. 
It follows that our 'nematode' network is capable of making 
decisions based on the values of the input variable $x_0$, 
provided that we interpret the variable $d$, which is equal to either 
0 or 1, as the result of the DM process, as defined in Figure 
1A. The decisions are made stochastically and depend 
upon the instantiations of the random variables $\eta_i$, 
which vary from trial to trial. 

Our model is completely defined by the set of noise 
variances $\overline{\eta_i^2}$ pertinent to each unit. Although the decisions 
made by this chain are quite simple, the identities of the 
decision makers are not so easy to find. The distribution 
of impact on DM along the chain should depend upon 
the distribution of noise variances $\overline{\eta_i^2}$. Our next goal is 
to develop a sensible definition of contributions to DM 
based on the vector of variances $\overline{\eta_i^2}$. Before doing so we 
describe the general input-output properties of the chain. 

Since the decision made by the network varies from trial to 
trial, one can define the response of the system averaged over 
trials, $\bar{d}(x_0)$. As shown in Figure 2B it has a sigmoid 
shape, smeared by the total amount of noise in the system. 
One can, therefore, consider two cases, depending 
on whether the signal-to-noise ratio for the chain is large 
or small. These two regimes are shown in Figure 3A and 
B respectively. 

FIG. 3 A, If the signal-to-noise ratio is high, responses of all 
units are well correlated with the output, as shown on the 
right by the mutual information between the response of a given 
unit and the output. B, For the case of low signal-to-noise ratio, 
the output is more correlated with the motor units (right) 
than with the sensory ones (left). 
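The sigmoid input-output relationship can be checked by direct simulation. The following is a minimal sketch of the chain defined by (1)-(4), assuming Gaussian noise as in the text; the function names, the number of units, and the trial count are illustrative choices, not part of the original model description. For Gaussian noise the trial-averaged decision is the error-function sigmoid $\Phi(x_0 / \sigma)$, where $\sigma^2$ is the sum of all noise variances.

```python
import math
import numpy as np

def simulate_chain(x0, noise_std, n_trials, rng):
    """Simulate the static chain x_i = x_{i-1} + eta_i and the decision d = H(x_N)."""
    noise_std = np.asarray(noise_std, dtype=float)
    # One row per trial, one column per unit; noise is independent across units.
    eta = rng.normal(0.0, 1.0, size=(n_trials, noise_std.size)) * noise_std
    x = x0 + np.cumsum(eta, axis=1)      # responses x_1 ... x_N on every trial
    d = (x[:, -1] > 0).astype(int)       # Heaviside step applied to x_N
    return x, d

def mean_decision_theory(x0, noise_std):
    """Trial-averaged decision for Gaussian noise: Phi(x0 / sigma_total)."""
    sigma_total = math.sqrt(float(np.sum(np.square(noise_std))))
    return 0.5 * (1.0 + math.erf(x0 / (sigma_total * math.sqrt(2.0))))

rng = np.random.default_rng(0)
noise_std = np.ones(10)                  # hypothetical: 10 units, unit noise variance
for x0 in (-6.0, -3.0, 0.0, 3.0, 6.0):
    _, d = simulate_chain(x0, noise_std, 20000, rng)
    theory = mean_decision_theory(x0, noise_std)
    print(f"x0={x0:+.1f}  simulated={d.mean():.3f}  theory={theory:.3f}")
```

Inputs much larger than the total noise give nearly deterministic decisions (the high signal-to-noise regime), while inputs near zero give decisions close to chance.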

To analyze the responses of units in these two cases we define 
their correlation with the decision. This correlation 
is defined for each element in the chain (Figure 3, right). 
As a measure of correlation we choose the mutual information 
(MI) between the response of the $i$-th unit, $x_i$, and the 
decision, $d$. MI has the advantage of being unitless (it 
is measured in bits) and having clear intuitive properties, 
as described below. We will also show below in this 
section that MI has limitations as a measure of DM. 

MI describes the information transmission from the 
$i$-th unit to the output of the system. Since the output 
can only have the values 0 or 1, MI cannot exceed 
one bit. We now consider two cases, depending 
on the network's signal-to-noise ratio. If the network input 
$|x_0|$ is large, as in Figure 3A, the response of the system is 
well correlated with the input. Hence, the activities of all 
units are well correlated with both input and output, 
and $MI(x_i, d) \approx 1$ for all of the units. In the opposite 
limit, when the signal-to-noise ratio is small, $|x_0|$ is 
smaller than the noise, and the system's response is weakly 
correlated with the input (Figure 3B). In this case MI as 
a function of the unit's position displays the structure shown 
in Figure 3B (right). This structure, as shown below, holds 
the key to the definition of DM components and is 
discussed qualitatively here. The units that are close to the 
exit from the network show strong correlation with the 
decision, similarly to the high signal-to-noise ratio case. 
Their MI is therefore close to 1 bit. On the other hand, 
the more 'sensory' units at the beginning of the chain are 
strongly correlated with the input. Since the input-output 
correlation is weak in the low signal-to-noise ratio case, the 
'sensory' units display virtually no relation to the output 
and $MI(x_i, d) \approx 0$ for such units (Figure 3B, right). 
Thus, MI as a function of $i$ displays a transition from 
0 to 1 in the low signal-to-noise ratio case. 
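The transition of MI from 0 to 1 along the chain can be reproduced numerically. The following is a minimal sketch, assuming zero input ($x_0 = 0$) and Gaussian noise. Under these assumptions, if a fraction $\rho$ of the output variance is generated at or upstream of unit $i$, then $P(d = 1 \mid x_i) = \Phi(z\sqrt{\rho/(1-\rho)})$ with $z$ standard normal, and $MI(x_i, d) = 1 - \overline{h(P)}$, where $h$ is the binary entropy in bits. This expression is worked out here from (3) and (4); it is not taken from Appendix A, and the function names and grid-based integration are illustrative choices.

```python
import math
import numpy as np

# Grid over a standard normal variable, with weights normalized to sum to one.
_Z = np.linspace(-8.0, 8.0, 4001)
_W = np.exp(-0.5 * _Z**2)
_W /= _W.sum()
_erf = np.vectorize(math.erf)

def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable (p clipped away from 0 and 1)."""
    p = np.clip(p, 1e-15, 1.0 - 1e-15)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))

def mi_with_decision(rho):
    """MI (bits) between x_i and d, given the fraction rho of output variance
    accumulated up to and including unit i (zero input assumed)."""
    if rho <= 1e-12:
        return 0.0            # unit carries none of the decisive variability
    if rho >= 1.0 - 1e-12:
        return 1.0            # unit already determines the decision
    a = math.sqrt(rho / (1.0 - rho))
    p1 = 0.5 * (1.0 + _erf(a * _Z / math.sqrt(2.0)))   # P(d = 1 | x_i)
    return 1.0 - float(np.sum(_W * binary_entropy(p1)))

def mi_profile(noise_var):
    """MI(x_i, d) for every unit of a chain with the given noise variances."""
    noise_var = np.asarray(noise_var, dtype=float)
    rho = np.cumsum(noise_var) / noise_var.sum()
    return np.array([mi_with_decision(r) for r in rho])

print(np.round(mi_profile(np.ones(10)), 3))   # uniform noise: gradual rise
print(mi_profile([0, 0, 1, 0, 0]))            # single noisy unit: a step
```

With a single noisy unit the profile is a step located at that unit, while distributed noise gives a gradual rise.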

How could one deduce the identities of decision makers 
from these dependencies (Figure 3A and B)? One could 
suggest that the elements perfectly correlated with the 
output of the system, such as the exit elements of the 
chain, are the ones that make the decision. However, such 
elements may be just relay or 'motor' units, in which 
case their contribution to DM is small. Indeed, when we 
type, our decisions are perfectly correlated with the activities 
of finger muscles; but one could hardly blame our fingers 
for the content of the typing. Thus, despite their high 
correlation with the output, exit elements cannot be 
called decision makers. Input elements, having no correlation 
with the decision, are responsible for DM to an even 
lesser degree. We thus need to analyze the dependence of 
MI on position in more detail and suggest another scheme 
for defining DM units. 

Our discarding of motor units as decision makers can 
be further extended onto the entire high signal-to-noise 
ratio case (Figure 3A). We suggest that the deterministic 
regime is not descriptive from the point of view of DM 
analysis. First, in this regime all units become indistin- 
guishable from motor. The latter are not decision mak- 
ers, as suggested above. Second, the dependence shown 
in Figure 3A (right) does not reveal the contributions of 
individual units to the decision. Since all units have the 
same correlation, it is hard, if not impossible, to differ- 
entiate them and assign different contributions. Third, 
the responses of units in this case are deterministically 
related to the input. Hence, units act as relays, passively 
transmitting information along the chain. It can be argued 
that the external environment, providing the input 
variable $x_0$, acts as the decision maker. We conclude that 
to find decision making activity one has to concentrate 
on the low signal-to-noise ratio case. 

We show below that the identities of decision making 
units can be deduced from the shape of the transition in Figure 
3B (right). To this end we analyze a set of examples 
of networks with various distributions of noise variances $\overline{\eta_i^2}$. We 
start from the simplest example of a single noisy unit. 



1. Example 1: 'Noisy' neuron. 

Consider a chain in which noise is absent from all units 
but one, whose order number in the chain is $n$ (Figure 4). 
Since, according to our previous discussion, we need to 
consider the low signal-to-noise ratio case, we will assume 
that 

$$x_0 = 0, \tag{5}$$

i.e. the network receives no input. Making the decision in 
this case is still possible, based on the values of noise 
inside the network. Since noise is only present in one 
neuron, from (3) we conclude that 

$$x_N = \eta_n. \tag{6}$$

The decision made by the network is 

$$d = H(\eta_n). \tag{7}$$

Thus, the decision is causally linked to the processes controlling 
unit number $n$, which leads us to the conclusion that this 
neuron is the decision maker. 

Paradoxically, the noisiest unit in this simple formulation 
makes the largest impact. All noiseless elements, 
even nonlinear ones, are deterministic and work as simple relays, 
which transmit information from the previous node 
to the next one. The output of the circuit is linked to the 
processes controlling noise in neuron number $n$, rather 
than in any other neuron in the network. 

One would be tempted to conclude that the non-linear 
element is actually the decision maker in this case. We 
deduce that the non-linear element does not have a causal 
effect on output from the circuit; therefore its role is just 
to relay response from neuron n to the output. In this 
respect the non-linear element is not different from other 
noiseless elements. 

To link this example to our previous discussion (Fig- 
ure 3B) we plot MI as a function of position in the chain 
in Figure 4 (top). As we discussed, MI is high for exit 
('motor') units and low for input ('sensory') elements. 
Figure 4 also shows the derivative of MI with respect 
to position in the chain. It is clear that this derivative 
represents the decision making element. Thus, we con- 
clude that not correlation with the decision but the rate 
of change of the latter along the network is the indicator 
of DM. 

FIG. 4 The example of 'noisy' neuron (marked by asterisk). 
Top panel, mutual information between given unit and the 
decision. Bottom panel, derivative of mutual information. 
The derivative represents the decision making unit in this 
case. 



2. Example 2: Uniformly distributed noise. 

Our next example shows that the conclusion about the 
derivative of MI is basically correct, but has to be slightly 
amended to be numerically precise. Consider a chain 
in which all elements are noisy and the variance of noise 
is the same for each element. In this case 

$$x_N = \eta_1 + \eta_2 + \ldots + \eta_{N-1} + \eta_N, \tag{8}$$

i.e. all units contribute to the decision equally. This is because 
Eq. (8) does not distinguish the order in which contributions 
from the units are added, and all contributions 
are of equal strength on average. Can this conclusion be 
confirmed by the derivative of MI? 

Figure 5A shows MI as a function of position in the 
chain for this case. This dependence is obtained in Appendix 
A. It increases smoothly from 0 to 1, resulting in 
a non-zero derivative at all units. This is consistent with 
(8) and the notion that all units participate in the decision. 
However, (8) suggests that all units participate in the 
decision equally. The derivative of MI turns out to be 
slightly non-uniform, as seen in Figure 5A. This can be 
corrected if not MI itself but a non-linear function of MI, 
denoted $F(MI)$, is considered. This non-linear function 
is calculated in Appendix A and is shown in Figure 5B. 
The new correlator $F(MI)$ has the same basic properties 
as MI. It rises from 0 to 1 monotonically when passing 
through the array (Figure 5C). But, in addition, its 
derivative turns out to be uniform, as shown in Figure 
5C (bottom). This is consistent with equal participation 
of all units in DM in the uniformly distributed noise case 
and Eq. (8). Thus, we conclude that for this case the 
contributions to DM are given by the rate of increase of 
$F(MI)$ when moving through the array 

$$DM_i = F(MI_i) - F(MI_{i-1}). \tag{9}$$

Here $i$ is the index along the chain. Eq. (9) is the main 
result of this paper. It represents our definition of contributions 
to DM for networks of simple connectivity, such 
as chains. 
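Definition (9) can be illustrated numerically, with one important caveat: Appendix A, which specifies $F$, is not reproduced in this section. The sketch below therefore assumes a particular construction, namely that $F$ is the inverse of the map from the fraction $\rho$ of output variance accumulated up to unit $i$ to $MI(x_i, d)$, computed for zero input and Gaussian noise. With this assumed choice $F(0) = 0$, $F(1) = 1$, $F$ is monotonic, and the differences in (9) are uniform for uniform noise, which are the properties stated in the text. No claim is made that the tabulated curve reproduces Figure 5B quantitatively; all names are hypothetical.

```python
import math
import numpy as np

_Z = np.linspace(-8.0, 8.0, 2001)        # grid over a standard normal variable
_W = np.exp(-0.5 * _Z**2)
_W /= _W.sum()
_erf = np.vectorize(math.erf)

def mi_of_fraction(rho):
    """MI (bits) between x_i and d when a fraction rho of the output variance
    arises at or upstream of unit i (zero input, Gaussian noise assumed)."""
    if rho <= 1e-12:
        return 0.0
    if rho >= 1.0 - 1e-12:
        return 1.0
    a = math.sqrt(rho / (1.0 - rho))
    p = 0.5 * (1.0 + _erf(a * _Z / math.sqrt(2.0)))
    p = np.clip(p, 1e-15, 1.0 - 1e-15)
    h = -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))
    return 1.0 - float(np.sum(_W * h))

# Tabulate MI(rho) once; the assumed F is read off by swapping the table's axes.
_RHO_GRID = np.linspace(0.0, 1.0, 401)
_MI_GRID = np.array([mi_of_fraction(r) for r in _RHO_GRID])

def F(mi):
    """Assumed F: inverse of the tabulated map rho -> MI (linear interpolation)."""
    return np.interp(mi, _MI_GRID, _RHO_GRID)

def dm_from_mi(noise_var):
    """Eq. (9): DM_i = F(MI_i) - F(MI_{i-1}), taking MI_0 = 0 for the input."""
    noise_var = np.asarray(noise_var, dtype=float)
    rho = np.cumsum(noise_var) / noise_var.sum()
    mi = np.array([mi_of_fraction(r) for r in rho])
    f = np.concatenate(([0.0], F(mi)))
    return np.diff(f)

print(np.round(dm_from_mi(np.ones(10)), 3))       # uniform noise
print(np.round(dm_from_mi([0, 0, 1, 0, 0]), 3))   # single noisy unit
```

Under the stated assumption the contributions sum to one by construction, since $F$ rises from 0 at the input to 1 at the output.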

Three points should be made about definition (9). 
First, it reproduces the result obtained in the previous 
example of the 'noisy' neuron. Indeed, the mutual information 
rises from 0 to 1 at the 'noisy' neuron in Figure 
4. But $F(MI)$ coincides with MI at these values, as follows 
from its plot in Figure 5B. Thus, the derivative of 
$F(MI)$ is also given by a single spike at the position of the 
'noisy' neuron, as in Figure 4 (bottom). Second, Eq. (9) 
implies that, from the point of view of DM, not mutual 
information but another correlator, given by $F(MI)$, is 
more relevant. Function $F$ deviates from a linear function 
only slightly (Figure 5B), and for practical purposes the 
distinction between MI and $F(MI)$ could be ignored. 
However, we retain it throughout the manuscript to ensure 
mathematical rigor. Third, when deducing (9) we 
did not postulate that contributions to DM are proportional 
to the variance of noise. Instead, we suggested that 
Eq. (8) implies that all units contribute equally, independently 
of their order in the chain. This simple qualitative 
statement is powerful enough to constrain our quantitative 
reasoning and lead to a measure of DM in the form of the 
function $F(MI)$ and definition (9). We do not know yet 
if the derivative of $F(MI)$ is proportional to the variance 
of noise, the square root of this variance, or any other 
characteristic of noise in each element. All of these parameters 
give the same results in the uniform noise case. 
We need to have a difference between units to measure 
the relative strength of their contributions. This is achieved 
by the next example. 

FIG. 5 The example with uniformly distributed noise. A, 
mutual information between the response of a given unit and the 
decision. The dependence has a non-uniform increase, suggesting 
that mutual information is not a good measure of decision 
making. B, if one applies a non-linear function (solid 
curve) to the mutual information in A, one obtains a uniformly 
increasing correlator in C. This non-linear function, 
called F(MI), is close to linear, shown by the dotted line. C, 
the new correlator F(MI) (top panel) has a uniform derivative 
(bottom panel). Thus, the derivative of F(MI) is a sensible 
measure of decision making in the case of uniform noise. 



3. Example 3: 'Loud' neuron. 

In this example the variances of noise on all neurons are 
the same, similarly to the previous case. However, here 
we amend the network definition given by (1). We do so 
for only one neuron. We assume that the link between 
units 5 and 6 is characterized by a very large strength 
$K \gg 1$. Thus, for neuron number 6 (Figure 6) instead 
of (1) we have 

$$x_6 = K x_5 + \eta_6 \tag{10}$$

Therefore this example is the same as the previous one, except 
that a single network connection is changed. What 
are the DM units in this case? 
The network's output is given by 

$$x_N = K(\eta_1 + \eta_2 + \ldots + \eta_5) + \eta_6 + \ldots + \eta_{11} \tag{11}$$

Thus, units 1 through 5 contribute equally to the decision. 
In addition, their contributions are multiplied by a large 
factor $K$. Units 6 through 11 also contribute equally, 
but their contribution is much smaller than that of the 
former group. We conclude that units 1 through 5 are 
much stronger decision makers than units 6 through 11. 
This conclusion is supported by the derivative of $F(MI)$, 
as shown in Figure 6 (bottom). 

FIG. 6 The 'loud' neuron example. The link between units 
5 and 6 is strengthened. Compare to Figure 5A. 

Thus, changing one link in the chain produces a large 
effect on the distribution of DM. The units downstream 
from the link contribute less to decisions, while the units 
upstream contribute a lot. What measure of decision 
making could differentiate these two types of 
units? 

Calculations in Appendix A show that the derivative of 
$F(MI)$ is proportional to $K^2$ for units 1 through 5. This 
is easy to understand qualitatively, since MI increases 
along the chain even for negative ($K < 0$) links. This 
would not be possible if the contributions from units 1 to 5 were 
multiplied by $K$, for example. Thus, an even power of $K$ is 
required, which is shown in Appendix A to be $K^2$. 

4. Alternative definition of DM. 

So far we have used definition (9), which is quite complex, 
since it involves calculation of a nonlinear function 
$F(MI)$. Is it possible to reproduce the results derived 
above in a simpler way? It turns out that the role of a 
given unit in DM is proportional to its contribution to 
the variability of the output $\overline{x_N^2}$. This leads us to a 
definition of DM alternative to (9). 

Let us introduce the new definition using the examples 
considered above. From (11) in the 'loud' neuron case we 
derive 

$$\overline{x_N^2} = K^2\left(\overline{\eta_1^2} + \ldots + \overline{\eta_5^2}\right) + \overline{\eta_6^2} + \ldots + \overline{\eta_{11}^2} \tag{12}$$

We could conjecture that the contributions to DM from 
different units are weighted proportionally to the corresponding 
summands in (12). Indeed, if we assume 

$$DM_{1 \ldots 5} \propto K^2\, \overline{\eta_{1 \ldots 5}^2}, \qquad DM_{6 \ldots 11} \propto \overline{\eta_{6 \ldots 11}^2} \tag{13}$$

by choosing appropriate values of the variance of noise and 
gain, we can reproduce the results of all three of our previous 
examples. Thus, in the case of the 'noisy' neuron the 
variance of noise is only present in one unit, rendering 
this unit the decision maker, according to (13). In the case 
of uniform noise, when $K = 1$ and all $\overline{\eta_n^2}$ are the same, 
(13) gives uniform contributions to DM. In the case of the 
'loud' neuron, (13) gives the correct factor $K^2$ describing 
the advantage of upstream neurons. Thus, the contributions 
to DM are proportional to the variance of noise on a 
given element, multiplied by the square of the gain from 
this element to the output. We can rewrite (13) in a more 
compact form to emphasize this latter statement 

$$DM_i \propto \overline{\eta_i^2}\, \frac{\partial \sigma^2(x_N)}{\partial \overline{\eta_i^2}} \tag{14}$$

where $\sigma^2(x_N)$ is the variance of the output, equal to $\overline{x_N^2}$ for zero input. 
One can verify (14) by applying it to (12) and obtaining 
relationships (13). This justifies (14) in the three 
examples considered above. 
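The three examples can be checked against (14) in a few lines. The following is a minimal sketch, assuming a chain in which every link may carry its own gain, $x_i = K_i x_{i-1} + \eta_i$, a direct generalization of (10); the array `link_gain` and the function name are hypothetical. For such a chain the gain from unit $i$ to the output is the product of the link gains downstream of $i$, and (14) reduces to the noise variance times the square of that gain.

```python
import numpy as np

def dm_contributions(noise_var, link_gain):
    """Normalized DM_i from (14) for a chain x_i = link_gain[i] * x_{i-1} + eta_i.

    link_gain[i] is the strength of the link entering unit i (0-based index).
    """
    noise_var = np.asarray(noise_var, dtype=float)
    link_gain = np.asarray(link_gain, dtype=float)
    # Gain from unit i to the output x_N: product of the link gains downstream of i.
    downstream = np.append(link_gain[1:], 1.0)
    gain_to_output = np.cumprod(downstream[::-1])[::-1]
    raw = noise_var * gain_to_output**2
    return raw / raw.sum()           # normalized so that contributions sum to one

# Example 1: a single noisy unit (the fourth of eight).
noisy = np.zeros(8)
noisy[3] = 2.0
print(dm_contributions(noisy, np.ones(8)))

# Example 2: uniform noise, all links of unit strength.
print(dm_contributions(np.ones(10), np.ones(10)))

# Example 3: 'loud' neuron, link between units 5 and 6 strengthened (K = 3 here).
gains = np.ones(11)
gains[5] = 3.0                       # link entering unit 6
print(np.round(dm_contributions(np.ones(11), gains), 4))
```

Because the gain enters squared, replacing $K$ by $-K$ leaves the contributions unchanged, in line with the argument for an even power of $K$.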

Eq. (14) also applies to linear chains in general. In Appendix 
A we derive (14) from the previous definition (9) for an 
arbitrary distribution of connection strengths and noise. 
Thus, (14) can be considered an alternative definition 
to (9). The equivalence between (9) and (14) is demonstrated 
graphically in Figure 7. 

Why should one consider an alternative definition? 
This is because (9) cannot be applied to networks of arbitrary 
connectivity, such as circuits containing loops. Definition 
(14), however, applies to all topologies, including 
the linear chain examples considered here. 

FIG. 7 Equivalence of two definitions. The top panel 
shows the distribution of noise variance (asterisk diameter) and 
of F(MI) (bars). The bottom panel displays the derivative 
of F(MI), defined by (9). The derivative is numerically the 
same as the variance of noise. Both can be used as measures 
of decision making. 



B. Conclusions from 'nematode' study 

Let us review our findings. First, we arrived at the definition 
of DM activity using the information-theoretical 
approach (9). According to this definition, DM is the rate 
of change of correlation with the decision along the chain. In 
other words, the first element or elements that correlate 
with the decision are the decision makers. This 
approach has its pros and cons. The viewpoint 
expressed by (9) has the potential to be transferred to other 
systems, which contain non-linear elements. Eq. (9) has 
an information-theoretical origin; hence its applicability 
may be broader than our simple system. Another advantage 
of (9) is that it relies on characteristics measurable 
in single-electrode recording experiments, such as the 
response of a single unit and its correlation with the behavioral 
decision. Thus, (9) could be used experimentally. The 
disadvantage of the information-theoretical approach is 
that it is not clear how to apply it to systems with 
loops, as we have mentioned above. Since biological networks 
almost always contain loops, this significantly limits 
the applicability of the information-theoretical formula (9). 

Our second step was to derive an alternative definition 
(14). The latter is equivalent to the former definition (9) 
for the linear-chain ('nematode') example, as we have demonstrated 
on simple examples and have shown more rigorously 
in Appendix A. The alternative definition (14) can 
be understood on the basis of the following two observations. 
First, the example of the 'noisy' neuron shows that 
variability is the source of decisions. Thus, 

Conclusion 1: Other conditions being fixed, an increase 
in variability and noise in a single unit leads to a 
larger contribution to DM from this unit. 

$$DM_i \propto \overline{\eta_i^2} \tag{15}$$

Second, the example of the 'loud' neuron shows that not 
only variability and noise are important but also how 
much of this variability reaches the motor units. DM is 
hence a property of network connectivity too. Thus, we 
arrive at the next rule 

Conclusion 2: The stronger the pathway from a 
given unit to the motor output, the larger the contribution 
of this unit to DM. 

$$DM_i \propto \frac{\partial \sigma^2(x_N)}{\partial \overline{\eta_i^2}} \tag{16}$$

These two rules are combined into the definition (14). 
Although (9) and (14) assume that the output element 
is unique, this requirement will be removed below, when 
we consider networks of arbitrary topology. 

What are the features of (14)? It can be used for 
a network of arbitrary topology, since it does not contain a 
derivative along the chain, as (9) does. Definition (14) 
can also be used operationally to measure the contribution 
of each neuron to the decision experimentally. To 
do that one needs to vary the noise at the given unit and 
measure the variability of the responses. The details are 
discussed in Section IV below. 
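The operational reading of (14) can be mimicked in simulation. The following is a minimal sketch, not the experimental protocol itself: it increases the noise variance at one unit by a fixed amount, re-measures the variance of the output, and uses the finite difference as an estimate of the derivative in (14). The chain with per-link gains, the step size, the trial count, and the reuse of the same random draws for the baseline and perturbed runs (a variance-reduction device available only in simulation) are all assumptions made for illustration.

```python
import numpy as np

def output_samples(noise_var, link_gain, z):
    """Propagate pre-drawn standard-normal samples z (trials x units) through
    the chain x_i = link_gain[i] * x_{i-1} + eta_i with zero input."""
    x = np.zeros(z.shape[0])
    for i in range(z.shape[1]):
        x = link_gain[i] * x + np.sqrt(noise_var[i]) * z[:, i]
    return x

def dm_by_perturbation(noise_var, link_gain, step=0.5, n_trials=100_000, seed=0):
    """Estimate DM_i from (14) by perturbing the noise variance of each unit."""
    noise_var = np.asarray(noise_var, dtype=float)
    link_gain = np.asarray(link_gain, dtype=float)
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_trials, noise_var.size))   # reused for every run
    base_var = output_samples(noise_var, link_gain, z).var()
    sensitivity = np.empty(noise_var.size)
    for i in range(noise_var.size):
        bumped = noise_var.copy()
        bumped[i] += step                                  # extra noise at unit i
        perturbed_var = output_samples(bumped, link_gain, z).var()
        sensitivity[i] = (perturbed_var - base_var) / step
    raw = noise_var * sensitivity                          # variance times sensitivity
    return raw / raw.sum(), sensitivity

gains = np.ones(11)
gains[5] = 3.0                                             # 'loud' link into unit 6
dm, sensitivity = dm_by_perturbation(np.ones(11), gains)
print(np.round(sensitivity, 2))
print(np.round(dm, 3))
```

For the 'loud' neuron chain the estimated sensitivities should cluster near $K^2$ upstream of the strengthened link and near one downstream of it, up to sampling error.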

A special note should be made about normalization in 
(14). Throughout this work we adopt the convention that 
DM contributions are evaluated for all units and then 
normalized proportionally to (14), so that the total sum 
of all contributions is equal to one (or 100%). We will 
assume this to hold below without explicitly mentioning it. 
Finally, we give another definition of DM contributions, 
which could be useful when noise in the system is the 
same for all units. In this case the only difference between 
units is due to the difference in their position in the network. 
We therefore call such a quantity topological DM. 

$$TDM_i \propto \frac{\partial \sigma^2(x_N)}{\partial \overline{\eta_i^2}} \tag{17}$$

As seen from e.g. (12), it does not depend on the levels 
of noise, and can be obtained from (14) by assuming 
that $\overline{\eta_i^2} = 1$ for all units. It therefore describes how 
strongly each element of the circuit affects the output. 
This quantity is sometimes helpful in describing the network's 
topology. 

Lastly, we discuss the notion of noise and variability in 
our approach. Is it really noise that leads networks 
to decisions? Not necessarily. Imagine that we have 
studied a chain-like network (Figure 8A) and performed 
the DM analysis described above. We found that the 
network contains two decision makers, which are equally 
important. A more thorough investigation may suggest 
that these units receive inputs from an external network, which 
in effect is responsible for DM. For example, these hidden 
pathways may be inputs from other sensory modalities or 
regulatory inputs of another type. Thus, DMA may help 
identify entry points from other, less studied, parts of the 
network. 

FIG. 8 Hidden pathway. The intensity of red shows the 
contribution to decision making for each unit. A, analysis for 
an incomplete connectivity reveals two decision makers. B, a 
more thorough study may show that this results from other 
inputs to the network. 

C. Trees 

Our studies indicate that the information-theoretical analysis 
[definition (9)] can be further extended to tree-like 
topologies (Figure 9). To this end we define a column 
vector $f$ such that $f_i = F(MI_i)$. Then (9) is equivalent 
to 

$$DM = (I - S) f. \tag{18}$$

Here $S$ is the structure matrix defined as follows. An 
element $S_{ij}$ of the structure matrix is equal to 1 if 
there is a connection from unit number $i$ to $j$. Matrix 
$I - S$ thus implements evaluating differences between connected 
elements in (9). The structure matrix is related to the 
connectivity matrix $C$, containing the network's weights, through 
$S_{ij} = |\mathrm{sign}(C_{ij})|$. Connectivity matrices for some networks 
are shown in Figures 9 and 10. 

FIG. 9 The mutual information approach can be extended to 
connectivities other than linear chains (A). Thus, decision 
makers on trees (B) can also be found. An arbitrary network 
can be specified by connectivity matrices, which are provided 
for illustration purposes. The non-zero entries in a connectivity 
matrix indicate a connection between two elements numbered 
on the left. An entry value describes the strength of the 
connection and does not have to be unitary or positive. 

The information-theoretical approach can be extended even further 
to cases in which signals propagate along the 
network in time, resulting in delays between 
signal and response. In this case $f$ should be understood 
as a sum of correlations over all times preceding 
the decision. This compensates for the presence of delays. 
So far it is not understood whether Eq. (18) [or (9)] 
can be used for topologies other than trees. Definition 
(14), however, can be used with networks of arbitrary 
connectivity. This is the topic of the next section. 

III. DYNAMIC MODELS 



All previous examples, except the one mentioned at the
end of the last section, were static, i.e. variables did not
depend on time. The deficiency of this approach is that 
it is not clear how to treat networks with loops. To ap- 
ply our analysis to the cases with loops, and, in general, 
to networks with arbitrary connectivity (Figure 10), we 
consider time-dependent models here. This allows us to 
observe propagation of noise around the loop explicitly 
and to make accurate conclusions about contributions to 
DM. 

We limit ourselves to linear dynamical systems, where 
the single nonlinear element is the last one, transforming 
an analog system output to a binary response. As the 
first step we consider temporal dynamics in the discrete- 
time approximation, which contains all essential features 
of our approach. Later in the section we extend discrete 
model to the continuous-time case and show their equivalence.

FIG. 10 The network topologies considered by the dynamic
models. Arbitrary connectivities, such as cycles (left) or
feedforward networks (right) can be considered.



A. Discrete-time model 

In this section we consider a system of N elements, whose
activity at each instant is described by an N-dimensional
column vector $x(t)$. Time takes discrete values separated by an
interval $\tau$; this model is therefore called the discrete-time
model. The values of activity at two neighboring time-slices are
related by the connection matrix $C$

$$x(t + \tau) = C\,x(t) + \eta(t) + s(t). \tag{19}$$

Here $\eta(t)$ is the vector describing noise added to the activity
vector on each time-slice. The variable $s(t)$ describes sensory
input into the system. The rules of temporal evolution of
activities described by this equation are general enough to
include almost all interesting phenomena and mimic modeling of
real systems on digital computers. In Appendix B we will prove
that this model is equivalent to systems with continuously
defined time.

Noise is specified by the vector $\eta(t)$, which has zero mean
and is defined by the correlation matrix

$$\overline{\eta(t_1)\,\eta^T(t_2)} = \mathcal{N}\,\delta_{t_1 t_2}. \tag{20}$$

We assume here that values of noise at neighboring times are not
correlated, implying that we consider a system with white noise.
This assumption can be easily relaxed and is used here to simplify
the analysis. It becomes rigorously valid when the time interval
$\tau$ is longer than the correlation time of noise. Further, if
noise is specific to each neuron, the same-time correlation matrix
$\mathcal{N}$ is diagonal

$$\mathcal{N} = \mathrm{diag}\left(\eta_1^2, \ldots, \eta_N^2\right). \tag{21}$$

This takes place, e.g., when stochasticity is induced by the
probabilistic nature of synaptic vesicle release, in which case
every two neurons receive uncorrelated fluctuating inputs.



Some time after presentation of the stimulus [$s(t) \neq 0$] the
system is forced to make a decision through the following process.
First, a scalar quantity

$$y = v^T x(t) \tag{22}$$

is evaluated. Here time $t$ corresponds to the instant when the
choice is to be made. The output metrics vector $v$ describes the
way in which the system's activity affects the motor response. In
the simplest case, which was considered in the previous section,
when a single element number $n$ evokes responses,
$v_i = \delta_{in}$. In a more complex situation, when multiple
areas/neurons have direct influence on the decision, vector $v$ has
more than one non-zero element. On the second step, the decision is
made based on the sign of $y$

$$d = H(y), \tag{23}$$

where $H$ is the Heaviside step function. Thus, this model
describes a two-alternative forced-choice task.

Our system is completely defined by the following set
of parameters: $C$, $s(t)$, $\mathcal{N}$, and $v$. As we have shown in
the previous section, the presence of the stimulus is not
required to define DM elements [Eq. (0)]. We therefore
set $s(t)$ to zero and are left with three parameters $C$, $\mathcal{N}$,
and $v$. We are now ready to determine DM elements in
our simple model.

To find decision makers we will use Eq. (14). In this case it
becomes

$$DM_i = \mathcal{N}_{ii}\,\frac{\partial \sigma^2(y)}{\partial \mathcal{N}_{ii}}. \tag{24}$$

Therefore, we need to evaluate the variability on the output from
the system, $\sigma^2(y)$. This is accomplished if we notice that
$y = v^T x(t) = x^T(t)\,v$ and

$$\sigma^2(y) = \overline{v^T x(t)\,x^T(t)\,v} = v^T X(t,t)\,v. \tag{25}$$

Here we introduced the cross-correlation matrix defined as follows

$$X(n,k) = \overline{x(n)\,x^T(k)} = \overline{\begin{pmatrix} x_1(n) \\ \vdots \\ x_N(n) \end{pmatrix}\begin{pmatrix} x_1(k) & \cdots & x_N(k) \end{pmatrix}}. \tag{26}$$

We replace here the time variable by integers specifying the
time-slice number. The averaging in (25) and (26) is assumed over
different instantiations of noise (trials).

Due to the properties of noise in our model, this correlator does
not depend on the absolute values of time ($n$ and $k$), but only
on the difference $(n - k)$. As follows from (25), of particular
interest is the same-time correlator $X_0 = X(n,n)$, which
determines fluctuations in $y$. We now derive the equation for the
same-time correlator $X_0$.

Using (19) we obtain

$$X_0 = \overline{x(n+1)\,x^T(n+1)} = \overline{\left[C x(n) + \eta(n)\right]\left[x^T(n)\,C^T + \eta^T(n)\right]}. \tag{27}$$

We then notice that the correlator $\overline{x(n)\,\eta^T(n)}$ is
identically zero, since $x(n)$ is a linear combination of values of
noise at times $k < n$ [see Eq. (19)]. We thus deduce from Eq. (27)
that

$$X_0 = C\,\overline{x(n)\,x^T(n)}\,C^T + \overline{\eta(n)\,\eta^T(n)},$$

which leads us, finally, to

$$X_0 - C X_0 C^T = \mathcal{N}. \tag{28}$$

This equation allows us to determine the same-time correlator $X_0$
from the connectivity and the noise cross-correlogram, defined in
(20), which is a diagonal matrix.

We would like to pause here and describe the properties
of this equation. First of all, in the most generic case
(28) allows us to determine $X_0$ from $C$ and $\mathcal{N}$ uniquely.
Indeed, (28) is a system of $N^2$ linear equations for the $N^2$
unknown elements of $X_0$, arranged in matrix form. Hence, this
system, in most cases, can be solved uniquely. On the
other hand, with one exception, $X_0$ cannot be expressed
explicitly in terms of matrices $C$ and $\mathcal{N}$. Thus, one has
to either appeal to the representation of $X_0$ in terms of
eigenvectors and eigenvalues of $C$, or use a computer to
arrange the elements of matrix $X_0$ in vector form and solve the
resulting linear system.
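
The vector-form solution mentioned above can be sketched in a few lines. The following is a minimal illustration, assuming NumPy; the function name `solve_same_time_correlator` and the three-unit chain are hypothetical choices made for the example and do not come from the text.

```python
import numpy as np

def solve_same_time_correlator(C, noise):
    """Solve X0 - C X0 C^T = noise for X0 by vectorising the matrix equation."""
    n = C.shape[0]
    # With row-major flattening, vec(C X C^T) = kron(C, C) @ vec(X).
    lhs = np.eye(n * n) - np.kron(C, C)
    x0 = np.linalg.solve(lhs, noise.reshape(-1))
    return x0.reshape(n, n)

# Hypothetical feedforward chain 1 -> 2 -> 3 with unit weights.
# Convention assumed here: C[j, i] is the weight of the link from unit i to unit j.
C = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
noise = np.diag([1.0, 1.0, 1.0])  # diagonal noise matrix with unit variances

X0 = solve_same_time_correlator(C, noise)
residual = X0 - C @ X0 @ C.T - noise  # should vanish if X0 solves the equation
```

For this chain the noise of upstream units accumulates downstream, so the variance on the diagonal of `X0` grows along the chain. The linear system becomes singular when a product of two eigenvalues of `C` equals one, which is the loop case discussed later in the text.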

The contribution to DM from a given element can be
determined from Eq. (25)

$$DM_i = \mathcal{N}_{ii}\,\frac{\partial \sigma^2(y)}{\partial \mathcal{N}_{ii}} = v^T\,\frac{\partial X_0}{\partial \ln \mathcal{N}_{ii}}\,v. \tag{29}$$

The topological DM contributions are

$$DM_i^{t} = v^T\,\frac{\partial X_0}{\partial \mathcal{N}_{ii}}\,v. \tag{30}$$

Using Eqs. (28) and (29) one can analyze a variety of
network connectivities. Some new effects emerging for
non-tree systems are described next.



B. Case 1: fan-out hub effect 

We now consider the network shown in Figure 11A, in
which all elements have the same variance of noise and
all connections have unitary strength. Figure 11A shows
two pathways from unit 2 to the exit unit, 6. The resulting
network gain from unit 2 to unit 6 is thus equal to
two. The gain of all other units at the exit is one. The
contribution to DM from unit 2 is thus four times larger than
that from other units. This is because noise at this unit is
multiplied by a factor of two, and the variance of noise
by a factor of four. We conclude that there may be some
special elements in a network, which occupy hub-like
positions, gaining large influence due to the abundance of their
outputs. It should be noted that fan-in hubs are not
special from the point of view of DM in any way.
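
The factor of four can be checked numerically. The sketch below assumes NumPy and a hypothetical six-unit wiring chosen so that unit 2 reaches the exit unit 6 by two equal-length paths while every other unit has a single path; the exact wiring of Figure 11A is not reproduced here, and the function names are illustrative only.

```python
import numpy as np

def solve_same_time_correlator(C, noise):
    """Solve X0 - C X0 C^T = noise by vectorisation (row-major flattening)."""
    n = C.shape[0]
    lhs = np.eye(n * n) - np.kron(C, C)
    return np.linalg.solve(lhs, noise.reshape(-1)).reshape(n, n)

def dm_contributions(C, noise_var, v):
    """DM_i = N_ii * v^T (dX0/dN_ii) v, normalised to sum to one."""
    n = len(noise_var)
    dm = np.zeros(n)
    for i in range(n):
        unit_noise = np.zeros((n, n))
        unit_noise[i, i] = 1.0
        # X0 is linear in the noise matrix, so solving with a single unit
        # entry yields the derivative dX0/dN_ii exactly.
        dX0 = solve_same_time_correlator(C, unit_noise)
        dm[i] = noise_var[i] * (v @ dX0 @ v)
    return dm / dm.sum()

# Assumed wiring (source, target), units numbered from 1 as in the text.
edges = [(1, 3), (2, 3), (2, 4), (3, 5), (4, 5), (5, 6)]
C = np.zeros((6, 6))
for src, dst in edges:
    C[dst - 1, src - 1] = 1.0   # unit link strength

noise_var = np.ones(6)          # same noise variance on all units
v = np.zeros(6)
v[5] = 1.0                      # unit 6 is the only exit unit

dm = dm_contributions(C, noise_var, v)
```

With this wiring the normalised contribution of unit 2 is four times that of each other unit, in line with the gain argument above.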






FIG. 11 Two cases, in which the identities of decision makers 
can be found using discrete-time approach. Variance of noise 
on all elements is the same; all network links have unitary 
strength. The degree of decision making is shown by the 
intensity of red. A, The fan-out effect. B, the temporal 
integrator. 



C. Case 2: temporal integrator 

Let us now examine a network with a loop. Figure
11B shows such an example with unitary link strength
and uniform noise variance, as in the previous case. The
presence of the loop affects DM drastically: our discrete-time
model marks units belonging to the loop as decision makers
(see footnote 1). This is easy to understand, since noise, generated
by each unit on each time-step, cannot leave the loop and,
therefore, builds up there without limit. Therefore the
variance of noise in the output of element number three
grows proportionally to time,
$\overline{[x_3(t) - \bar{x}_3(t)]^2} \sim \eta^2 t \to \infty$.
Here averaging is assumed over instantiations of noise
(trials). Thus, the loop becomes the crucial decision maker.
This case is somewhat analogous to our previous 'noisy'
neuron example.
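
The linear growth of variance in a loop can be illustrated without random sampling by iterating the recursion that underlies Eq. (28) before stationarity is assumed, $X(n+1) = C X(n) C^T + \mathcal{N}$. The sketch below assumes NumPy and a hypothetical isolated ring of four units with unit link strengths; the value of the noise variance is arbitrary.

```python
import numpy as np

n_units = 4
eta2 = 0.5                                  # assumed noise variance per unit per step
C = np.roll(np.eye(n_units), 1, axis=0)     # ring: each unit feeds the next one
noise = eta2 * np.eye(n_units)

X = np.zeros((n_units, n_units))            # start from a noiseless state
variances = []
for step in range(1, 21):
    X = C @ X @ C.T + noise                 # covariance update for one time-slice
    variances.append(X[2, 2])               # variance of unit number three

variances = np.array(variances)
```

Because the ring matrix only permutes the units, the variance on every unit increases by `eta2` on each step and never saturates, which is the behavior described above.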

What is the possible role of loops in biological networks?
Why would one introduce such unreliable components?
Loops similar to the one shown in Figure 11B have
many useful properties. For instance, they can act as
parametric memory systems. Indeed, imagine that re- 
sponses of all units in the loop have the same values, 
equal to x. This could be accomplished by manipulat- 
ing the sensory inputs. Assume that no more inputs are 
received from the outside of the system. It follows that, 
in the absence of noise on each element, this value of re- 
sponse will reverberate around the loop forever. This is 
because all links have unitary strength. Loops can thus 
memorize a graded value, such as x, functioning as para- 
metric memory elements. 

Suppose, in addition, that a non-zero input s is applied 
to element number 1 at all times. Since this element acts 



as a summator, its response on the next step is $x_1(1) = x + s$.
The signal $s$ propagates around the loop, and in four steps it
reaches the first element again, at which time its response is
$x_1(5) = x + 2s$. In four more steps $x_1(9) = x + 3s$. Thus, not
only noise, but also signal can build up in the system. Therefore,
a loop can operate as a temporal integrator. The integration is not
perfect if one of the links has a non-unitary strength, in which
case the integrator becomes leaky (Robinson, 1989).

1 Rigorously speaking, the set of equations (28) and (29) does not
have a valid solution for the loop with all connections equal to
unity. One needs to set one of the connections as a parameter,
$a < 1$, solve the equations, and consider the limit $a \to 1$.
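
The accumulation of signal in the loop can be traced step by step. The following is a minimal noiseless sketch, assuming NumPy and a hypothetical ring of four units; the numerical values of `x_init` and `s` are arbitrary.

```python
import numpy as np

C = np.roll(np.eye(4), 1, axis=0)      # ring of four units with unit link strengths
x_init, s = 1.0, 0.25                  # assumed stored value and constant input
state = np.full(4, x_init)             # all units start at the same value x
drive = np.array([s, 0.0, 0.0, 0.0])   # input applied to element number 1 only

first_unit = {}
for step in range(1, 10):
    state = C @ state + drive          # discrete-time update without noise
    first_unit[step] = state[0]
```

The first element reads `x + s` after one step, `x + 2s` after five steps and `x + 3s` after nine steps, reproducing the sequence given in the text. Replacing one link weight by a value below one would make the stored value decay, which corresponds to the leaky integrator.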

Temporal integrators play a special role in DM, since
they act as accumulators of sensory information, which
puts them into a special position with respect to other areas
(Gold and Shadlen, 2002). An example is area LIP in primate
parietal cortex, which is involved in DM in the
direction-discrimination task (Shadlen and Newsome, 2001;
Roitman and Shadlen, 2002; Mazurek et al., 2003).



D. Continuous-time model 

We finally consider a model in which time runs continuously.
This model has potential relevance to real-life networks. The
responses of units satisfy the following equation

$$\frac{dx(t)}{dt} = -A\,x(t) + \eta(t) + s(t). \tag{31}$$

The network connectivity matrix $A$ can be related to the
connection matrix from the discrete-time model in (19) through
$C = e^{-A\tau}$ (see Appendix B). Noise is defined by its
cross-correlation

$$\overline{\eta(t_1)\,\eta^T(t_2)} = \mathcal{N}\,\delta(t_1 - t_2), \tag{32}$$

where

$$\mathcal{N} = \begin{pmatrix} \eta_1^2 & & 0 \\ & \ddots & \\ 0 & & \eta_N^2 \end{pmatrix} \tag{33}$$

is a diagonal cross-correlogram of noise. Eqs. (31)-(33) are
analogous to the discrete-time case (19)-(21). Similarly, we define
the output scalar and the decision variable

$$y = v^T x(t), \qquad d = H(y). \tag{34}$$

Here $t$ is the time when the system makes the decision.

Our model is thus defined by Eqs. (31)-(34). We will now use
definition (29) to find decision makers. As in the discrete-time
case we need to know the variance of the output variable,
$\sigma^2(y)$, after which (29) leads to

$$DM_i = \eta_i^2\,\frac{\partial \sigma^2(y)}{\partial \eta_i^2}. \tag{35}$$

Important for us is the time-dependent correlator

$$X(t_1, t_2) = \overline{x(t_1)\,x^T(t_2)}, \tag{36}$$

which we now evaluate. The solution of (31) is obtained using
matrix exponentials

$$x(t) = \int_{-\infty}^{t} dt'\, e^{A(t'-t)} \left[\eta(t') + s(t')\right]. \tag{37}$$

If the external stimulus is zero or constant in time, then, due to
(32), the correlator at $t_1 > t_2$ is

$$X(t_1, t_2) = \int_{-\infty}^{t_2} dt'\, e^{A(t'-t_1)}\,\mathcal{N}\,e^{A^T(t'-t_2)}. \tag{38}$$

We seek $X(t_1, t_2)$ in the form

$$X(t_1, t_2) = e^{A(t_2-t_1)} X_0, \tag{39}$$

where $X_0$ is the equal-time cross-correlation. To find the
equation for $X_0$ we differentiate (38) with respect to $t_2$ as
follows

$$\frac{\partial X}{\partial t_2} = A\,e^{A(t_2-t_1)} X_0 = e^{A(t_2-t_1)}\,\mathcal{N} - e^{A(t_2-t_1)} X_0 A^T. \tag{40}$$

We thus arrive at the following equation for $X_0$

$$A X_0 + X_0 A^T = \mathcal{N}. \tag{41}$$



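This equation is the central tool for the continuous-time
theory. The contributions to DM from each unit are found by
differentiating $\sigma^2(y) = v^T X_0 v$ with respect to noise, as
in Eq. (29)

$$DM_i = \mathcal{N}_{ii}\, v^T\,\frac{\partial X_0}{\partial \mathcal{N}_{ii}}\,v. \tag{42}$$

Once the same-time correlation matrix $X_0$ is found from
Eq. (41), the cross-correlation for arbitrary times is

$$X(t_1, t_2) = \begin{cases} e^{A(t_2-t_1)} X_0, & t_1 > t_2, \\ X_0\, e^{A^T(t_1-t_2)}, & t_1 < t_2. \end{cases} \tag{43}$$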

This equation suggests a helpful strategy for determining the
noise matrix $\mathcal{N}$. Indeed, (41) and (43) imply that

$$\mathcal{N} = \left.\frac{\partial X(t_1,t_2)}{\partial t_1}\right|_{t_1 = t_2 - \varepsilon} - \left.\frac{\partial X(t_1,t_2)}{\partial t_1}\right|_{t_1 = t_2 + \varepsilon}. \tag{44}$$

Here $\varepsilon$ is an infinitesimally small positive number. In
other words, the noise matrix is equal to the discontinuity in the
time derivative of the cross-correlation at $t_1 = t_2$. Since the
noise correlation matrix is diagonal, the non-zero elements are

$$\eta_i^2 = \left.\frac{\partial X_{ii}(t_1,t_2)}{\partial t_1}\right|_{t_1 = t_2 - \varepsilon} - \left.\frac{\partial X_{ii}(t_1,t_2)}{\partial t_1}\right|_{t_1 = t_2 + \varepsilon}. \tag{45}$$

Two comments are in order here. First, the noise term $\eta(t)$
plays the role of input noise in (31). It cannot be measured
directly. Equation (44) provides a way to single it out. Second,
(44) does not apply to the discrete-time model. Indeed, in the
latter we either have $t_1 = t_2$, or $t_1 = t_2 \pm 1$, etc., i.e.
the condition $t_1 = t_2 \pm \varepsilon$ with $\varepsilon$
infinitesimally small is hard to enforce. It may happen that
$\varepsilon \approx 1$ is acceptable due to the presence of slow
components in the circuit, such as temporal integrators. However,
in the general case (44) cannot be applied to the discrete-time
case. For instance, it fails dramatically for the case of the
'nematode' chain considered above.
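
The relation between the noise variance and the kink of the autocorrelation at equal times can be checked on the simplest case. The sketch below assumes a single leaky unit, $dx/dt = -a x + \eta(t)$, whose stationary autocorrelation follows from the continuous-time equations above as $(\eta^2/2a)\,e^{-a|t_1 - t_2|}$; the function names and parameter values are hypothetical, and finite differences stand in for the one-sided derivatives.

```python
import math

def autocorrelation(t1, t2, a, eta2):
    """Stationary autocorrelation of one leaky unit dx/dt = -a x + noise."""
    return eta2 / (2.0 * a) * math.exp(-a * abs(t1 - t2))

def noise_variance_from_kink(corr, t2, eps):
    """Difference of one-sided slopes in t1, taken just before and just after t2."""
    slope_before = (corr(t2, t2) - corr(t2 - eps, t2)) / eps
    slope_after = (corr(t2 + eps, t2) - corr(t2, t2)) / eps
    return slope_before - slope_after

a, eta2 = 2.0, 0.7   # assumed leak rate and true noise variance
estimate = noise_variance_from_kink(
    lambda t1, t2: autocorrelation(t1, t2, a, eta2), t2=0.0, eps=1e-4)
```

The estimate approaches `eta2` as `eps` shrinks. With measured data the usable `eps` is bounded below by the sampling interval, which is the limitation noted in the second comment above.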

Equations (41), (42), and (44) represent a useful set of
tools to find DM components for various connectivities.
We present here two possible cases, in which decision
makers can be found. They differ in what is known about
the system.

Scenario 1: Assume we know the network connectivity $A$, the
output metrics vector $v$, and the autocorrelation for each unit
$X_{ii}(t_1, t_2)$. The steps below allow finding the
decision makers.

1. Since the noise matrix is diagonal, as per (33), it can
be found from the autocorrelation using (44).

2. Solving (41) allows determining
$\partial X_0/\partial \mathcal{N}_{ii}$, the derivative of the
equal-time cross-correlation with respect to noise in each element.

3. Decision makers are found from (42).

4. Normalize contributions to DM so that $\sum_i DM_i = 1$.
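
Steps 2 to 4 of Scenario 1 can be sketched as follows, assuming NumPy and assuming the noise variances from step 1 are already available. The function names and the two-unit example, in which unit 1 drives the exit unit 2, are hypothetical.

```python
import numpy as np

def solve_lyapunov(A, noise):
    """Solve A X0 + X0 A^T = noise, Eq. (41), by vectorisation."""
    n = A.shape[0]
    eye = np.eye(n)
    # Row-major flattening: vec(A X) = kron(A, I) vec(X), vec(X A^T) = kron(I, A) vec(X).
    lhs = np.kron(A, eye) + np.kron(eye, A)
    return np.linalg.solve(lhs, noise.reshape(-1)).reshape(n, n)

def decision_makers(A, noise_var, v):
    """Normalised DM_i = N_ii * v^T (dX0/dN_ii) v, as in Eq. (42)."""
    n = len(noise_var)
    dm = np.empty(n)
    for i in range(n):
        unit_noise = np.zeros((n, n))
        unit_noise[i, i] = 1.0
        dX0 = solve_lyapunov(A, unit_noise)      # step 2: X0 is linear in the noise
        dm[i] = noise_var[i] * (v @ dX0 @ v)     # step 3
    return dm / dm.sum()                         # step 4

# Assumed example for dx/dt = -A x + noise: both units leak at rate 1,
# and unit 1 excites unit 2 with unit strength (entry -1 in A).
A = np.array([[1.0, 0.0],
              [-1.0, 1.0]])
v = np.array([0.0, 1.0])                         # unit 2 is the exit unit
dm = decision_makers(A, noise_var=[1.0, 1.0], v=v)
```

In this example the exit unit carries two thirds of the normalised contribution and the upstream unit one third, because noise injected upstream is partly filtered out by the leak before it reaches the output.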

Scenario 1 does not require simultaneous measurements
from all units. It requires knowledge of the network
connectivity, however. The next scenario is complementary
in this respect.

Scenario 2: Suppose we have measured the full cross-correlation
matrix $X(t_1, t_2)$ by simultaneous recordings
from all units. Suppose also that we know how the output
of the system is evaluated (vector $v$). These are the
steps to determine DM units.

1. Use (44) to find the noise matrix $\mathcal{N}$.

2. Use (41) to find the connection matrix $A$.

3. Solve (41) to calculate $\partial X_0/\partial \mathcal{N}_{ii}$
for each element.

4. Use (42) to find decision makers.

5. Normalize contributions to DM so that $\sum_i DM_i = 1$.

Both scenarios use extensive knowledge about the sys- 
tem, which renders them useless in experimental condi- 
tions. In the next subsection we discuss a way to bypass 
these limitations. 



Finally, we would like to provide the solution to (41) using the
eigenbasis of matrix $A$. Since $A$ is not necessarily symmetric,
a distinction should be made between right and left eigenvectors.
The latter turn out to be useful for our purposes. They are
defined by

$$\xi^{\alpha} A = \lambda_\alpha\, \xi^{\alpha}. \tag{46}$$

Here and below Greek indexes denote numbers of eigenvalues,
while Latin ones label spatial components of vectors and
matrices. The solution of (41) is

$$(X_0)_{ij} = \sum_{\alpha\beta\gamma\delta}\sum_{mn} \xi^{\alpha *}_i\, (G^{-1})_{\alpha\beta}\, \frac{\xi^{\beta}_m\, \mathcal{N}_{mn}\, \xi^{\gamma *}_n}{\lambda_\beta + \lambda_\gamma^*}\, (G^{-1})_{\gamma\delta}\, \xi^{\delta}_j, \tag{47}$$

where

$$G_{\alpha\beta} = \sum_i \xi^{\alpha}_i\, \xi^{\beta *}_i \tag{48}$$

is the Gram matrix of eigenvectors. Eq. (47) is valid
if the eigenvectors form a complete basis in the N-dimensional
space. As follows from (47), eigenvalues of $A$ with small real
part contribute to DM to a large degree. This justifies the use of
principal component analysis when such eigenvalues are present.
An example of such a principal component is the temporal
integrator loop in Figure 11B, which has vanishing $\lambda$.

If matrix $A$ is symmetric, its eigenvalues are real and its
eigenvectors are orthogonal. This leads to a unit Gram matrix.
Then, Eq. (47) becomes more compact

$$(X_0)_{ij} = \sum_{\alpha\beta mn} \frac{\xi^{\alpha}_i\, \xi^{\alpha}_m\, \xi^{\beta}_n\, \xi^{\beta}_j}{\lambda_\alpha + \lambda_\beta}\, \mathcal{N}_{mn}. \tag{49}$$

Similar equations, called Kubo formulas, are obtained for
various correlators in the case of diffusion of particles in
random media (Efetov, 1997). The distinguishing feature
of (49) is that a product of four eigenvectors enters the
expression. Thus, propagation of noise in this case can
be accompanied by interference between different pathways.
An example of destructive interference of this kind
is given below, in section IV.

Eq. (49) can be further simplified. Indeed, our model
uses diagonal noise matrices, i.e. $n = m$ in (49). Suppose
also that the output from the network occurs through
one exit element number $i$, which is specified by taking
$v_j = \delta_{ij}$. In this case the use of Eq. (49) gives

$$DM_n = \eta_n^2 \sum_{\alpha\beta} \frac{\xi^{\alpha}_i\, \xi^{\alpha}_n\, \xi^{\beta}_n\, \xi^{\beta}_i}{\lambda_\alpha + \lambda_\beta}. \tag{50}$$

From this equation we conclude that for element $n$ to
contribute to DM, an eigenvector should exist which is
non-zero on both unit number $n$ and exit unit $i$. Thus,
we conclude that eigenvectors of $A$ should be delocalized
for broader impact of elements on the decision. This is
not surprising in view of the mentioned analogy with the
diffusion problem. If matrix $A$ is not symmetric,
the Gram matrix may be non-diagonal and (50) cannot
be used. However, the off-diagonal elements of $G$ are usually
smaller than the diagonal ones, due to uncorrelated sign
changes when (48) is computed with $\alpha \neq \beta$. Therefore,
(50) may apply approximately.
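
For a symmetric matrix $A$ the eigenbasis solution can be verified against the defining equation directly. The following sketch assumes NumPy; the three-unit symmetric matrix and the function names are hypothetical, and `numpy.linalg.eigh` supplies orthonormal eigenvectors so that the Gram matrix is the identity.

```python
import numpy as np

def lyapunov_by_eigenbasis(A, noise):
    """Solve A X0 + X0 A^T = noise for symmetric A using its orthonormal eigenbasis."""
    lam, U = np.linalg.eigh(A)                 # columns of U are eigenvectors
    noise_in_basis = U.T @ noise @ U
    denom = lam[:, None] + lam[None, :]        # lambda_alpha + lambda_beta
    return U @ (noise_in_basis / denom) @ U.T

def variance_sensitivity(A, exit_unit):
    """Derivative of the output variance with respect to the noise on each unit n."""
    lam, U = np.linalg.eigh(A)
    weights = U[exit_unit, :][None, :] * U     # weights[n, a] = xi^a_exit * xi^a_n
    inv_denom = 1.0 / (lam[:, None] + lam[None, :])
    return np.einsum('na,nb,ab->n', weights, weights, inv_denom)

# Assumed symmetric example with positive eigenvalues.
A = np.array([[2.0, -0.5, 0.0],
              [-0.5, 2.0, -0.5],
              [0.0, -0.5, 2.0]])
noise = np.diag([1.0, 0.5, 2.0])

X0 = lyapunov_by_eigenbasis(A, noise)
residual = A @ X0 + X0 @ A.T - noise
sens = variance_sensitivity(A, exit_unit=2)
```

Multiplying `sens[n]` by the noise variance of unit `n` gives the unnormalised contribution of that unit for a single exit element. A unit contributes only through eigenvectors that are non-zero on both that unit and the exit unit, which is visible in the product inside `weights`.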



IV. ANALYSIS USING STIMULATION 

Stimulation with electric current adds a new degree
of freedom to DMA, thus leading to more effective ways
of finding decision makers. There are two great advantages
of the stimulation method. First, it only involves
stimulation of a single neuron, therefore no simultaneous
multiple-electrode measurements are required. Second,
knowledge of the network connectivity is not needed to
solve the problem. In this section we study our simple
networks and find what stimulation strategies are consistent
with our earlier definitions, such as Eq. (14).

We will use the continuous model for concreteness (section
III.D). Consider the output variable $y$. It is a linear function
of the inputs. It also contains noise components, which vary from
trial to trial. The noise components are acquired from all units
to different degrees. Since noise in each unit is Gaussian, the
output variable is described by a Gaussian distribution too

$$p(y) = \frac{1}{\sqrt{2\pi}\,\sigma(y)} \exp\left[-\frac{(y - \bar{y}(s))^2}{2\sigma^2(y)}\right]. \tag{51}$$

In each trial a random value of $y$ is obtained, according
to distribution (51). The response of the system is equal
to 1 if $y$ is positive, and 0 otherwise. The probability to
obtain a response equal to 1 to a given stimulus $s$ is given by
the error function (Abramowitz and Stegun, 1972)

$$p_1(s) = \int_0^{\infty} p(y)\,dy = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{\bar{y}(s)}{\sigma(y)\sqrt{2}}\right)\right], \tag{52}$$

whereas the probability of zero response is

$$p_0(s) = \int_{-\infty}^{0} p(y)\,dy = \frac{1}{2}\left[1 - \mathrm{erf}\left(\frac{\bar{y}(s)}{\sigma(y)\sqrt{2}}\right)\right]. \tag{53}$$



Both probabilities depend upon the mean response to the
stimulus $\bar{y}(s)$ and the standard deviation $\sigma(y)$.
Therefore the electric stimulation strategies may be based on
affecting either the former or the latter. We now consider
both of these strategies and show that affecting the mean
response may provide misleading results, while changing
the variance of the response allows estimating contributions
to DM consistently with our previous definitions.
Thus, strategies of stimulation based on the standard deviation
of the output variable are always correct in our
simple model, independently of the topology of the network.
This may seem a trivial consequence of definition
(14), but we will discuss it here for the sake of comparing
the two strategies and optimizing them.

We start with the strategies of stimulation which affect
the mean response $\bar{y}(s)$. In our simple model this
may be accomplished by injecting a tonic input current
into unit number $i$. Mathematically it is accomplished
by adding an extra stimulus $s_i$ to this unit in Eq. (31).
Note that in biological systems the stimulating current
is alternating with constant amplitude (Salzman et al.,
1992). The mean response is shifted by the stimulation,
i.e.

$$\Delta \bar{y} = \frac{\partial \bar{y}}{\partial s_i}\, s_i, \tag{54}$$

where $s_i$ is the magnitude of the injected tonic current. This
leads to observable changes in the probability $p_1$

$$\Delta p_1(i) = \frac{\partial p_1}{\partial \bar{y}}\,\frac{\partial \bar{y}}{\partial s_i}\, s_i. \tag{55}$$

Here $\Delta p_1(i)$ is the change in probability of correct
responses after unit number $i$ is electrically stimulated.
Can $\Delta p_1(i)$ be a measure of DM?

We notice that $\Delta p_1(i)$ can be either positive or negative.
This depends on the sign of the derivative
$\partial \bar{y}/\partial s_i$, which is positive for an excitatory
pathway from unit $i$ to the output and negative for an inhibitory
pathway. Since a contribution to DM ought to be positive, we cannot
assume simply that $DM_i \sim \Delta p_1(i)$. The correct
expression, which we provide here without derivation, is

$$DM_i \sim \eta_i^2 \left[\Delta p_1(i)\right]^2. \tag{56}$$

This equation is understood in the proportional sense, since
$DM_i$ should be normalized to ensure that $\sum_i DM_i = 1$.
Our investigations show that this expression is accurate
for trees and is consistent with both our earlier definitions,
the information-theoretical one and (14). Remarkably, it employs
quantities which can be measured in a single-electrode experiment.
Indeed, the amplitude of noise $\eta_i^2$ can be found from the
autocorrelation of the unit's response, using (44), and
$\Delta p_1(i)$ is determined from behavioral changes in response
to single-unit stimulation. This equation thus provides an approach
potentially useful in practice. Does this relationship work
for networks of arbitrary connectivity?

Figure 12B shows a counterexample, in which a unit
is stimulated, which results in no change in the probability
of correct response (we consider trials in which 1 is
the correct response throughout this section). This is
because there are two pathways leading from this unit
to the exit, one positive and one negative. They have
equal strength and, therefore, compensate each other.
On the other hand, unit number one does participate in
DM, because if a non-stationary stimulation/stimulus is
applied, its effect on the decision is not zero. Thus, (56)
and the tonic stimulation method cannot be applied to arbitrary
circuits, such as the one shown in Figure 12B, to accurately
reveal decision makers.

FIG. 12 Finding decision makers using electric stimulation.
A and B, tonic stimulation; C, random stimulation. A, tonic
stimulation for trees results in a shift in probability, which leads
to correct estimation of decision making units. B, example
of a circuit for which tonic stimulation leads to incorrect
estimation of decision making, since it does not lead to a shift
in probability. C, stimulation with a random current leads
to a correct estimate of decision making for networks with any
connectivity. D, for optimal performance in the random stimulation
paradigm, the task should be set so that the probability
of correct responses is close to $p_{\mathrm{optimal}} \approx 0.84$.



Is there a stimulation method for finding decision making
components in arbitrary networks? The method follows
directly from the definition (14) [or (35), which is
equivalent]. Indeed, when the stimulating current is a temporal
white noise, the output variable $y$ acquires a larger
variance (Figure 12C). Hence, the derivative of the output
variance entering (35) can be calculated operationally,
by injecting a distracter current. More precisely, if the
variance of the stimulating current applied to unit $i$ is
$s_i^2$, the derivative entering definition (35) is

$$\frac{\partial \sigma^2(y)}{\partial \eta_i^2} = \frac{\Delta\sigma^2(y)}{s_i^2}. \tag{57}$$



In practice one has no access to the variable $y$, so one
cannot measure directly the change in variance
$\Delta\sigma^2(y)$. Instead, one could measure the change in the
probability of correct responses under the influence of the
distracting current. Indeed, from (52) we obtain

$$\Delta p_1(i) = \frac{d p_1}{d \sigma^2(y)}\,\Delta\sigma^2(y). \tag{58}$$

Combining the last two equations we obtain for the important
derivative

$$\frac{\partial \sigma^2(y)}{\partial \eta_i^2} = \frac{\Delta p_1(i)}{s_i^2} \bigg/ \frac{d p_1}{d \sigma^2(y)}. \tag{59}$$



Since the probability of correct responses always decreases
under the influence of distracters, the derivative
$dp_1/d\sigma^2(y)$ is a negative constant. It is the same for all
units. We arrive therefore at the expression for contributions
to DM, which follows from (35)

$$DM_i \sim \frac{\eta_i^2}{s_i^2}\,\Delta p_1(i). \tag{60}$$

Here $\Delta p_1(i)$ is the decrease in probability of correct
responses produced by electric stimulation with variance
of the random current equal to $s_i^2$. The variance of noise
on each unit $\eta_i^2$ can be found from the autocorrelation
using (45). This procedure works for any topology in our
simplified model. It should be noted here that if noise
is not entirely white or cannot be considered white, (45)
cannot be used directly and should be replaced by an
expression reflecting the spectral characteristics of noise
appropriate for the system under investigation. Thus, if
noise is provided by other parts of the network, its dynamic
features may be more complex. Therefore, (45)
may not apply directly to the 'hidden pathway' example
given at the end of section II.B.

The procedure which we just described permits further
optimization. Indeed, imagine that the probability
of correct responses is exactly 1/2. Adding a distracting
stimulation current will not change this probability, i.e.
$\Delta p_1(i) = 0$ no matter what unit is stimulated. In the
opposite limiting case, when $p_1 \approx 1$, the effect of the
distracter on performance is exponentially small. Hence, the
behavioral response to stimulation has an optimum between
$p_1 = 1/2$ and 1. To find the optimum we observe
from (58) that $\Delta p_1$ is maximum for the same variation
$\Delta\sigma^2(y)$ when the magnitude of $dp_1/d\sigma^2(y)$ is
maximum. We therefore plot the latter derivative as a function of
$p_1$ in Figure 12D. We indeed observe a maximum at the value of
probability of correct responses close to

$$p_{\mathrm{optimal}} \approx 0.841. \tag{61}$$
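
The location of this optimum can be reproduced from Eq. (52) alone. Differentiating (52) with respect to $\sigma^2(y)$ at fixed mean gives a sensitivity proportional to $u\,e^{-u^2}$ with $u = \bar{y}/(\sigma\sqrt{2})$; the sketch below scans the mean output on a grid and reports the corresponding probability of correct responses. It uses only the Python standard library, and the function names and the grid are illustrative assumptions.

```python
import math

def p_correct(mean_y, sigma):
    """Probability of response 1 for a Gaussian output, Eq. (52)."""
    return 0.5 * (1.0 + math.erf(mean_y / (sigma * math.sqrt(2.0))))

def sensitivity(mean_y, sigma):
    """Magnitude of d p1 / d sigma^2 at fixed mean output, derived from Eq. (52)."""
    u = mean_y / (sigma * math.sqrt(2.0))
    return u * math.exp(-u * u) / (2.0 * math.sqrt(math.pi) * sigma ** 2)

sigma = 1.0
means = [k * 0.001 for k in range(1, 4001)]          # scan mean output from 0 to 4 sigma
best_mean = max(means, key=lambda m: sensitivity(m, sigma))
p_opt = p_correct(best_mean, sigma)
```

The maximum falls where the mean output equals one standard deviation, which corresponds to a probability of correct responses close to 0.84, in agreement with the value quoted above.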
To summarize, the following scenario describes the algorithm
for finding contributions to DM using random stimulation.

Scenario 3: Assume that we do not know the network
connectivity $A$ and output metrics vector $v$, but we know the
autocorrelation for each unit $X_{ii}(t_1, t_2)$. The steps below
allow finding the decision makers.

1. Prepare the stimulus so that the probability of correct
responses is close to the value given by (61).

2. Stimulate one unit with a random current, whose
variance is $s_i^2$, and measure the decrease in probability
of correct responses $\Delta p_1$.

3. Record the autocorrelation and evaluate the noise variance
$\eta_i^2$ for this unit using (45).

4. Find the contribution to DM for this unit using equation (60).

5. Repeat steps 1 through 4 for all units in the system.

6. Normalize contributions to DM so that $\sum_i DM_i = 1$.
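
The steps of Scenario 3 can be emulated on a surrogate network. In the sketch below, which assumes NumPy, the matrix `A` and the vector `v` play the role of the hidden system and are used only to generate the "behavioral" probabilities; the estimator itself uses just the noise variances, the stimulation variance and the change in probability. The two-unit network, the function names and the parameter values are hypothetical.

```python
import math
import numpy as np

def solve_lyapunov(A, noise):
    """Solve A X0 + X0 A^T = noise, Eq. (41), by vectorisation."""
    n = A.shape[0]
    eye = np.eye(n)
    lhs = np.kron(A, eye) + np.kron(eye, A)
    return np.linalg.solve(lhs, noise.reshape(-1)).reshape(n, n)

def p_correct(mean_y, variance):
    """Probability of response 1, Eq. (52)."""
    return 0.5 * (1.0 + math.erf(mean_y / math.sqrt(2.0 * variance)))

def random_stimulation_dm(A, v, noise_var, stim_var):
    """Estimate DM_i from the drop in p1 under white-noise stimulation, Eq. (60)."""
    noise_var = np.asarray(noise_var, dtype=float)
    base_var = v @ solve_lyapunov(A, np.diag(noise_var)) @ v
    mean_y = math.sqrt(base_var)              # step 1: sets p1 close to 0.84
    p_base = p_correct(mean_y, base_var)
    dm = np.empty(len(noise_var))
    for i in range(len(noise_var)):           # step 5: loop over units
        stimulated = noise_var.copy()
        stimulated[i] += stim_var             # step 2: random current on unit i
        var_i = v @ solve_lyapunov(A, np.diag(stimulated)) @ v
        drop = p_base - p_correct(mean_y, var_i)   # decrease in p1
        dm[i] = noise_var[i] * drop / stim_var     # step 4 (step 3 supplies noise_var)
    return dm / dm.sum()                      # step 6

# Hidden surrogate system: unit 1 drives the exit unit 2.
A = np.array([[1.0, 0.0],
              [-1.0, 1.0]])
v = np.array([0.0, 1.0])
dm = random_stimulation_dm(A, v, noise_var=[1.0, 1.0], stim_var=1e-3)
```

For a weak stimulation variance the estimate agrees with the contributions obtained directly from the definition for the same network, roughly one third for the upstream unit and two thirds for the exit unit. Larger stimulation variances introduce curvature corrections, so in practice the stimulation should stay small compared with the intrinsic output variance.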



V. DISCUSSION 

In this work we defined decision makers in networks, 
which behave in a well-defined fashion. As with any 
definition, there is certain degree of arbitrariness in our 
study, since this is the first mathematical study of this 
sort. We had to make choices about the features of de- 
cision making we were attempting to describe as well as 
about the way they were quantified. We demonstrated 
these features in a set of examples. Future studies will 
show if these features can be used as a basis of a more 
complete model-independent theory. 

In this study we postulated that variability and noise, 
causally linked to decisions, are the chief descriptors of 
DM. Although this point may seem paradoxical we sug- 
gest three arguments in its favor. First, variability may 
reflect additional information needed to make a decision 
in case of uncertainty. Such may be inputs from other 
modalities, memories, or some other relevant modula- 
tory inputs, supplying e.g. emotional condition of the 
subject or changing utility values (Figure 8). Second, 
many behaviors, such as C-start escape responses in fish 
(Eaton and Emberley, 1991) and other organisms (Glim- 
cher, 2003), have stochastic character. This makes the 
task of pursuer more difficult. Such unpredictable be- 
haviors are reproduced in our model if the sensory input 
is weak or in the small signal-to-noise ratio case. Third, 
the goal of DM is to dissipate sensory information, as 
suggested in the introduction (Figure 1A), whereby an 
analog multi-dimensional stimulus space is reduced to a 
discrete space of several decisions. We argue that this 
transformation is facilitated by noise. 



We have studied the problem of finding decision mak- 
ing units in networks of various connectivities. This 
path took us from simple linear chains, for which the 
information-theoretical (IT) approach was found to be 
effective, to trees, and, finally to an alternative definition 
of decision makers, based on propagation of noise in net- 
works. This latter definition is valid in networks of arbi- 
trary topology. All these approaches are equivalent, when 
they can be compared, but include progressively broader 
classes of networks. As a practical application for the al- 
ternative definition we considered the problem of electric 
stimulation in the surrogate networks and showed a way 
of determining DM contributions for arbitrary networks 
using stimulation with random current. Our findings are 
summarized in Figures 13 and 14. 




FIG. 13 The cluster of problems covered in this study. 
Solid/dashed arrows show the derivations performed here/yet 
to be confirmed or denied. IT stands for information- 
theoretical. 

Although we studied networks of complex connectiv- 
ity, the model describing a single network element was 
quite simple. Not all of the units are linear, of course, 
since DM is a non-linear task (Figure 1A). However, our 
model is essentially based on linear elements. The moti- 
vation for this model is that it is easy to analyze. The 
study of simple models is a necessary step before analysis 
proceeds any further. Once the methodological issues are 
resolved for simpler models, complex non-linear systems 
can be studied in the same paradigm. One of the important
questions resolved here is that a completely linear
element can be a decision maker, despite the presence
of non-linear units in the network. Thus, nonlinearity is
not a necessary attribute of DM. This question would be
impossible to answer for a more realistic system, since in
practice all units contain nonlinearities.

The decision making task, as formulated in Figure 1A, is
similar to a general object discrimination task. The
representation of the motor response in our model is
mathematically indistinguishable from the representation of an
abstract object/decision category (Horwitz and Newsome, 1998;
Shadlen and Newsome, 2001). The latter does not necessarily
lead to a motor command. Thus, our analysis may uncover the
identities of units responsible for categorization of sensory
inputs. In terms of this analysis we emphasize the distinction
between units representing the object category and the units
in which this representation is actually formed. The former
are analogous to motor units in the decision task, while the
latter are similar to decision makers. As follows from this
study, the analysis depends on the topology of the network
involved. For simple linear sensory chains our conclusion is
that the first unit, spatially or temporally, in which the
representation of the object is correlated with the final
outcome of the discrimination process, is responsible for
casting the stimulus into one of the abstract classes. In the
case of recurrent networks a more detailed quantitative
analysis is needed to draw conclusions about the identities of
categorizing units. Thus, DMA may find a broader use in
identifying units representing abstract object percepts.

FIG. 14 Comparison between different approaches studied
here.

Special care should be taken in distinguishing the DM task
from the sensory discrimination task. It may occur that in the
same experiment these tasks are performed by different
populations of neurons. An example is given by Salinas and
Romo (1998). They discovered a population of M1 neurons
responding differentially to two categories of tactile stimuli.
Some of these neurons did not respond when the same behavior
was guided by visual cues. This observation is consistent with
these neurons performing sensory discrimination of tactile
stimuli, while some other population makes decisions about the
actual motor response. Our mathematical analysis is general
enough to include both of these functions. Thus, if
correlations with the motor response are studied, the analysis
yields the decision makers, whereas if correlations with
percepts are investigated, DMA should provide the identities
of the discriminating elements.

We suggest that DMA may be relevant to other biological
systems. Possible applications may include the analysis of
molecular networks, such as genetic regulatory or protein
binding networks; finding decision makers in compartmental
models of dendritic trees (Poirazi and Mel, 2001); studies of
neural networks and structural networks of connectivities
between different brain areas; and analysis of social
networks.



VI. CONCLUSION

In this study we define network elements responsible for
making decisions. We obtain two equivalent definitions.
According to the first, decisions are made by the elements in
which correlations with the decision are first formed.
According to the second, decision making activity is measured
by the impact of variability in a given unit on the response.
We give examples of network motifs that are especially potent
from the decision making perspective, such as fan-out hubs and
recurrent loops. The latter can function as temporal
integrators of sensory inputs. We also study how electric
stimulation can reveal decision making components. We conclude
that stimulation with time-varying random current produces
correct results for all network topologies.



APPENDIX A: The linear chain model. 

Here we solve a more general version of the linear chain
model than considered in the text. The responses of
neighboring neurons are related linearly:

$$x_n = C_{n-1} x_{n-1} + \eta_n. \tag{A1}$$

This is a generalization of the linear chain equation of the
main text. The response of the $n$th unit is

$$x_n = \sum_{i=1}^{n} \alpha_{ni} \eta_i, \tag{A2}$$

where the coefficients are $\alpha_{ni} = C_{n-1} C_{n-2} \cdots C_i$
and $\alpha_{nn} = 1$. The external signal $x_0$ is assumed to be
zero in this appendix (see the main text). For the last element
in the chain we have

$$x_N = \sum_{i=1}^{N} \alpha_{Ni} \eta_i. \tag{A3}$$

Comparing (A2) and (A3) we conclude that

$$x_N = \alpha_{Nn} x_n + \xi, \tag{A4}$$

where $\xi$ is a variable that describes noise in the network
downstream from unit $n$. It is, thus, uncorrelated with $x_n$.
This is where the tree-like topology enters our solution, since
in the case of loops $x_n$ and $\xi$ are correlated. Our goal
now is to calculate the MI between the decision variable
$d = H(x_N)$ and $x_n$. We will use the definition of MI

$$MI(d, x_n) = \sum_{d=0,1} \int_{-\infty}^{\infty} dx_n \, p(d, x_n) \log_2 \frac{p(d, x_n)}{p(d)\, p(x_n)}. \tag{A5}$$

Here $p(d) = 1/2$, since there is no signal,

$$p(x_n) = \exp\left(-x_n^2 / 2\overline{x_n^2}\right) / \left(2\pi \overline{x_n^2}\right)^{1/2}, \tag{A6}$$

and

$$\frac{p(d, x_n)}{p(x_n)} = \frac{1}{2}\left[1 \pm \mathrm{erf}\left(\frac{\alpha_{Nn} x_n}{\sqrt{2}\,\sigma(\xi)}\right)\right]. \tag{A7}$$

The upper/lower sign is assumed for $d = 0$ or $1$ in (A7);
$\sigma(\xi)$ is the standard deviation of the Gaussian variable
$\xi$ defined in (A4). The expression for MI (A5) results in

$$MI_n = M(s_n),$$
$$M(s_n) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} dz \, e^{-z^2} \left[1 + \mathrm{erf}(z s_n)\right] \log_2 \left[1 + \mathrm{erf}(z s_n)\right],$$
$$s_n = \sigma(\alpha_{Nn} x_n) / \sigma(\xi). \tag{A8}$$

MI is therefore a function of the signal-to-noise ratio $s_n$.
Inversely,

$$\frac{\alpha_{Nn}^2 \overline{x_n^2}}{\overline{\xi^2}} = \left[M^{-1}(MI_n)\right]^2. \tag{A9}$$
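A brief sanity check on $M(s)$ from (A8), on which the inversion (A9) relies. It assumes that the prefactor of the integral is $1/\sqrt{\pi}$, so that MI is measured in bits, and uses $u \log_2 u \to 0$ as $u \to 0$:

$$M(0) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} dz \, e^{-z^2} \cdot 1 \cdot \log_2 1 = 0,$$
$$\lim_{s \to \infty} M(s) = \frac{1}{\sqrt{\pi}} \int_{0}^{\infty} dz \, e^{-z^2} \cdot 2 \log_2 2 = \frac{2}{\sqrt{\pi}} \cdot \frac{\sqrt{\pi}}{2} = 1.$$

Under this assumption, a unit whose activity is swamped by downstream noise carries no information about the decision, while a unit followed by a noiseless path carries the full one bit of the binary decision.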



On the other hand, (A4) leads to

$$\overline{x_N^2} = \alpha_{Nn}^2 \overline{x_n^2} + \overline{\xi^2}. \tag{A10}$$

Solving (A9) and (A10) with respect to
$\alpha_{Nn}^2 \overline{x_n^2}$ we have

$$\frac{\alpha_{Nn}^2 \overline{x_n^2}}{\overline{x_N^2}} = \frac{\left[M^{-1}(MI_n)\right]^2}{1 + \left[M^{-1}(MI_n)\right]^2} \equiv F(MI_n). \tag{A11}$$

Function $M^{-1}$ here is the inverse of $M$ defined in (A8).
Function $F(MI)$, calculated numerically from (A8) and (A11),
is shown in Figure 5. Lastly, we recall that the variances
$\overline{x_n^2}$ are related to the strength of noise through
(A2). We have

$$\overline{x_n^2} = \sum_{i=1}^{n} \alpha_{ni}^2 \overline{\eta_i^2}. \tag{A12}$$

Eqs. (A11) and (A12) are used below to prove a variety of
statements about function $F(MI)$ used in the main text.



1. In the uniform noise example F(MI) is a linear function
of position in the chain.

In this case $C_1 = \dots = C_{N-1} = 1$, and, consequently,
$\alpha_{N1} = \dots = \alpha_{NN} = 1$. Noise variance is the
same on every node, i.e. $\overline{\eta_i^2} = \eta^2$. As
follows from (A12), $\overline{x_n^2} = \eta^2 n$, which results
in

$$F(MI_n) = n/N. \tag{A13}$$

It follows that contributions to DM defined by (5) are the
same for all units.



2. In the 'loud' neuron example the contributions of units
upstream from the strong link are larger by a factor of K^2
than contribution from the downstream units.

In this case $\alpha_{Nn} = K$ for $n = 1, \dots, k$, while
$\alpha_{Nn} = 1$ for $n = k+1, \dots, N$, assuming that the
link from unit $k$ to $k+1$ is strengthened. In the example in
the text $k = 5$. Eq. (A12) leads us to the values for
variances of responses

$$\alpha_{Nn}^2 \overline{x_n^2} = \begin{cases} \eta^2 K^2 n, & n \le k \\ \eta^2 K^2 k + \eta^2 (n - k), & n > k \end{cases} \tag{A14}$$

Applying (A11) we obtain the expression for $F(MI)$

$$F(MI_n) = \begin{cases} \dfrac{K^2 n}{N - k + K^2 k}, & n \le k \\[2ex] \dfrac{n - k + K^2 k}{N - k + K^2 k}, & n > k \end{cases} \tag{A15}$$

which is a piece-wise linear function of $n$. Eq. (5)
determines contributions to DM as

$$DM_n = \begin{cases} \dfrac{K^2}{N - k + K^2 k}, & n \le k \\[2ex] \dfrac{1}{N - k + K^2 k}, & n > k \end{cases} \tag{A16}$$

This confirms that the upstream units ($n \le k$) are $K^2$
times more potent than the downstream ones ($n > k$).



3. Two definitions of contribution to DM using derivative
of F(MI) (5) and the impact of noise (14) are equivalent.

Let us start by determining decision makers from definition
(14). According to (A3)

$$\overline{x_N^2} = \sum_{i=1}^{N} \alpha_{Ni}^2 \overline{\eta_i^2}. \tag{A17}$$

Definition (14) gives

$$DM_i \propto \overline{\eta_i^2} \, \frac{\partial \overline{x_N^2}}{\partial \overline{\eta_i^2}} = \alpha_{Ni}^2 \overline{\eta_i^2}. \tag{A18}$$

After normalization we obtain

$$DM_i = \frac{\alpha_{Ni}^2 \overline{\eta_i^2}}{\overline{x_N^2}}. \tag{A19}$$

Let us derive the same result from (5). As follows from (A11)

$$F(MI_n) - F(MI_{n-1}) = \frac{\alpha_{Nn}^2 \overline{x_n^2} - \alpha_{N\,n-1}^2 \overline{x_{n-1}^2}}{\overline{x_N^2}} = \frac{\alpha_{Nn}^2}{\overline{x_N^2}} \left(\overline{x_n^2} - C_{n-1}^2 \overline{x_{n-1}^2}\right) = \frac{\alpha_{Nn}^2 \overline{\eta_n^2}}{\overline{x_N^2}}. \tag{A20}$$

This proves the equivalence of (5) and (14), since the result
is identical to (A19) with $i = n$.
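The closed-form results of this appendix can be checked numerically. The following is a minimal sketch, not the authors' code. All function names are hypothetical, and it assumes independent zero-mean Gaussian noise, the recursion x_{n+1} = C_n x_n + eta_{n+1} with x_0 = 0, and a prefactor of 1/sqrt(pi) in the integral (A8). It evaluates F directly from the variances (A11, A12) and, independently, through the mutual information route (A8, A9, A11).

```python
import math


def chain_f_values(gains, noise_var):
    """F(MI_n) = alpha_Nn^2 <x_n^2> / <x_N^2> for a linear chain.

    gains[n-1] is the assumed link strength C_n from unit n to unit n+1;
    noise_var[n-1] is the noise variance on unit n.
    """
    n_units = len(noise_var)
    # Variances <x_n^2>, built by the recursion equivalent to (A12).
    var = [noise_var[0]]
    for n in range(1, n_units):
        var.append(gains[n - 1] ** 2 * var[-1] + noise_var[n])
    # alpha[n] holds alpha_{N,n+1}: product of gains downstream of the unit.
    alpha = [1.0] * n_units
    for n in range(n_units - 2, -1, -1):
        alpha[n] = alpha[n + 1] * gains[n]
    return [alpha[n] ** 2 * var[n] / var[-1] for n in range(n_units)]


def dm_contributions(gains, noise_var):
    """Contributions to DM as first differences of F along the chain."""
    f = [0.0] + chain_f_values(gains, noise_var)
    return [f[n] - f[n - 1] for n in range(1, len(f))]


def mutual_information(s, z_max=6.0, steps=2000):
    """M(s) of (A8) in bits, by the trapezoid rule on [-z_max, z_max]."""
    h = 2.0 * z_max / steps
    total = 0.0
    for j in range(steps + 1):
        z = -z_max + j * h
        u = 1.0 + math.erf(z * s)
        term = math.exp(-z * z) * u * math.log2(u) if u > 0.0 else 0.0
        total += term * (0.5 if j in (0, steps) else 1.0)
    return total * h / math.sqrt(math.pi)


def inverse_mi(mi, lo=0.0, hi=50.0, iters=50):
    """Invert M by bisection; M(s) increases monotonically for s >= 0."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mutual_information(mid) < mi:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def f_from_mi(mi):
    """F(MI) of (A11), computed from the inverse of M."""
    m = inverse_mi(mi)
    return m * m / (1.0 + m * m)


# 'Loud' neuron example: N = 10 units, link from unit k = 5 strengthened, K = 3.
loud_gains = [1.0, 1.0, 1.0, 1.0, 3.0, 1.0, 1.0, 1.0, 1.0]
loud_dm = dm_contributions(loud_gains, [1.0] * 10)
print([round(v, 3) for v in loud_dm])
```

For these illustrative parameters (A16) predicts K^2/(N - k + K^2 k) = 0.18 for the five upstream units and 1/(N - k + K^2 k) = 0.02 for the five downstream ones, which is what the first differences of F reproduce. The function `f_from_mi` gives a way to tabulate the curve F(MI) of (A11) without any closed form for the inverse of M.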
APPENDIX B: Connection between discrete- and
continuous-time models.

In this section we show that the discrete-time model can be
derived from the continuous-time model. Starting from equation
(37) for the unit responses in the continuous case, we obtain
the relation between solutions at two time points separated by
the time interval $\tau$, analogous to (19) in the
discrete-time description. Then we show that in the limiting
case $\tau \to 0$ the two descriptions are equivalent.

From

$$\dot{x}(t) = -A x(t) + \eta(t)$$

we obtain

$$x(t + \tau) = e^{-A\tau} x(t) + \int_{t}^{t+\tau} e^{-A(t + \tau - t')} \eta(t') \, dt'. \tag{B1}$$

This equation can be rewritten as
$x(t + \tau) = C x(t) + \eta'(t)$, where $\eta'(t)$ denotes the
integral term in (B1) and

$$C = e^{-A\tau} \approx I - A\tau. \tag{B2}$$

Thus it has the same form as (19). Using (32) we obtain that
the new noise cross-correlation matrix is

$$\mathcal{N}' = \int_{0}^{\tau} e^{-At'} \mathcal{N} e^{-A^{T} t'} \, dt'. \tag{B3}$$

The solution of the continuous-time problem satisfies the
equations of the discrete-time model for an arbitrarily large
time interval $\tau$, but the new noise cross-correlation
matrix $\mathcal{N}'$ is non-diagonal in this case. In the
limiting case $\tau \to 0$ it becomes diagonal. Indeed, (B3)
implies that in this limit

$$\mathcal{N}' = \mathcal{N}\tau, \tag{B4}$$

which is diagonal by the definition of the continuous-time
model. Here we kept only terms linear in $\tau$. Thus, in this
limit the matrix $\mathcal{N}'$ is diagonal, as needed in our
formulation of the discrete-time model. One can also derive
the corresponding continuous-time relation of the main text
from its discrete-time counterpart using (B2) and (B4) and
taking the limit $\tau \to 0$.
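The limits (B2) and (B4) can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than the authors' procedure. The 2x2 matrix A and the diagonal noise matrix are arbitrary illustrative choices, the function names are hypothetical, the matrix exponential is a plain Taylor series (adequate only for small matrices of moderate norm), and the integral (B3) is approximated by the midpoint rule.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def expm(a, terms=40):
    """Matrix exponential by truncated Taylor series."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term_k = term_{k-1} * a / k, i.e. a^k / k!
        term = [[v / k for v in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def discretize(a, noise_diag, tau, steps=400):
    """Return C = exp(-A tau) of (B2) and the noise matrix N' of (B3).

    noise_diag holds the diagonal of the continuous-time noise matrix.
    """
    n = len(a)
    c = expm([[-a[i][j] * tau for j in range(n)] for i in range(n)])
    h = tau / steps
    n_new = [[0.0] * n for _ in range(n)]
    for j in range(steps):
        t = (j + 0.5) * h  # midpoint of the j-th subinterval
        e = expm([[-a[i][k] * t for k in range(n)] for i in range(n)])
        for r in range(n):
            for col in range(n):
                # (E N E^T)_{r,col} for a diagonal N
                n_new[r][col] += h * sum(
                    e[r][k] * noise_diag[k] * e[col][k] for k in range(n))
    return c, n_new


# Illustrative (assumed) coupling matrix and noise strengths.
A = [[1.0, -1.0], [0.0, 1.0]]
NOISE = [1.0, 2.0]

c_small, n_small = discretize(A, NOISE, 1e-3)
c_large, n_large = discretize(A, NOISE, 1.0)
print("off-diagonal of N'/tau, small tau:", n_small[0][1] / 1e-3)
print("off-diagonal of N'/tau, large tau:", n_large[0][1] / 1.0)
```

For the small time step the matrix C is close to I - A tau and N'/tau is close to the original diagonal noise matrix, while for the large time step the off-diagonal element of N' is clearly non-zero, in line with the statement that the discrete-time noise becomes diagonal only in the limit of small tau.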